Syllabic quantity patterns as rhythmic features for Latin authorship attribution
Abstract
It is well known that, within the Latin production of written text, peculiar metric schemes were followed not only in poetic compositions, but also in many prose works. Such patterns were based on so-called syllabic quantity, i.e., on the length of the involved syllables, and there is substantial evidence suggesting that certain authors had a preference for certain patterns over others. In this research we investigate the possibility to employ syllabic quantity as a base for deriving rhythmic features for the task of computational authorship attribution of Latin texts. We test the impact of these features when combined with other topic-agnostic features. Our experiments, carried out on three different datasets using two machine learning methods, show that these features are beneficial in discriminating among authors.
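As a rough illustration of how rhythmic features might be derived from syllabic quantity, the sketch below counts n-grams over a string of syllable lengths. It is not the procedure used in the paper. The encoding (`-` for a long syllable, `u` for a short one), the n-gram range, and the function names `quantity_ngrams` and `relative_frequencies` are all assumptions made for this example, and the scansion step that would produce the quantity string from Latin text is assumed to have been done already.

```python
from collections import Counter


def quantity_ngrams(quantities: str, n_min: int = 3, n_max: int = 5) -> Counter:
    """Count n-grams over a string of syllable quantities.

    Hypothetical encoding: '-' marks a long syllable, 'u' a short one.
    Any other character (e.g. a space between words) is dropped.
    """
    seq = "".join(ch for ch in quantities if ch in "-u")
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def relative_frequencies(counts: Counter) -> dict:
    """Normalise raw counts so that texts of different length are comparable."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


# Made-up quantity string, used only to show the shape of the output.
features = quantity_ngrams("-uu -uu -- -u", n_min=3, n_max=3)
freqs = relative_frequencies(features)
```

The resulting frequency dictionary could then be vectorised and concatenated with other topic-agnostic features before being passed to a classifier; which classifier and which additional features to use is left open here.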
Similar resources
Authorship Attribution Using Word Network Features
In this paper, we explore a set of novel features for authorship attribution of documents. These features are derived from a word network representation of natural language text. As has been noted in previous studies, natural language tends to show complex network structure at word level, with low degrees of separation and scale-free (power law) degree distribution. There has also been work on ...
Authorship attribution: using rich linguistic features
We describe here the technical details of our participation in PAN 2012's "traditional" authorship attribution tasks. The main originality of our approach lies in the use of a large quantity of varied features to represent textual data, processed by a maximum entropy machine learning tool. Most of these features make intensive use of natural language processing annotation techniques as well ...
Deep Level Lexical Features for Cross-lingual Authorship Attribution
Crosslingual document classification aims to classify documents written in different languages that share a common genre, topic or author. Knowledge-based methods and others based on machine translation deliver state-of-the-art classification accuracy; however, because of their reliance on external resources, poorly resourced languages present a challenge for these types of methods. In this paper...
Towards Authorship Attribution for Bibliometrics using Stylometric Features
The overwhelming majority of scientific publications are authored by multiple persons; yet, bibliographic metrics are only assigned to individual articles as single entities. In this paper, we aim at a more fine-grained analysis of scientific authorship. We therefore adapt a text segmentation algorithm to identify potential author changes within the main text of a scientific article, which we o...
A multitude of linguistically-rich features for authorship attribution
This paper reports on the procedure and learning models we adopted for the 'PAN 2011 Author Identification' challenge targeting real-world email messages. The novelty of our approach lies in a design which combines shallow characteristics of the emails (word and trigram frequencies) with a large number of ad hoc linguistically-rich features addressing different language levels. For the autho...
Journal
Journal title: Journal of the Association for Information Science and Technology
Year: 2022
ISSN: 1532-2882, 1532-2890
DOI: https://doi.org/10.1002/asi.24660